26 research outputs found

    A practical guide and software for analysing pairwise comparison experiments

    Get PDF
    The most popular strategies for capturing subjective judgments from humans involve the construction of a unidimensional relative measurement scale, representing order preferences or judgments about a set of objects or conditions. This information is generally captured either by direct scoring, in the form of a Likert or cardinal scale, or by comparative judgments in pairs or sets. In this context, pairwise comparisons are becoming increasingly popular because of the simplicity of the experimental procedure. However, this strategy requires non-trivial data analysis to aggregate the comparison ranks into a quality scale and to analyse the results, in order to take full advantage of the collected data. This paper explains the process of translating pairwise comparison data into a measurement scale, discusses the benefits and limitations of such scaling methods, and introduces publicly available Matlab software. We improve on existing scaling methods by introducing outlier analysis, providing methods for computing confidence intervals and statistical testing, and introducing a prior, which reduces estimation error when the number of observers is low. Most of our examples focus on image quality assessment.
    Comment: Code available at https://github.com/mantiuk/pwcm
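
    The core of such scaling is a maximum-likelihood fit of per-condition quality scores to the observed comparison counts. Below is a minimal Python sketch assuming a Thurstone Case V observer model (unit-variance Gaussian noise, first condition anchored at zero), which is a common choice for this task but an assumption here; the released Matlab toolbox additionally provides the prior, outlier analysis and confidence intervals described above, none of which are reproduced. The count matrix C and the unit convention are illustrative.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import norm

    def scale_thurstone(C):
        """C[i, j] = number of times condition i was chosen over condition j."""
        n = C.shape[0]

        def neg_log_likelihood(q):
            s = np.concatenate(([0.0], q))   # anchor s[0] = 0 to fix the scale
            d = s[:, None] - s[None, :]      # pairwise score differences
            p = norm.cdf(d)                  # P(i preferred over j), Case V
            return -np.sum(C * np.log(p + 1e-12))  # guard against log(0)

        res = minimize(neg_log_likelihood, np.zeros(n - 1), method="L-BFGS-B")
        return np.concatenate(([0.0], res.x))

    # Toy example: 3 conditions, 20 comparisons per pair.
    C = np.array([[0, 15, 18],
                  [5,  0, 12],
                  [2,  8,  0]])
    print(scale_thurstone(C))    # quality scores in JND-like units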

    Distilling Style from Image Pairs for Global Forward and Inverse Tone Mapping

    Full text link
    Many image enhancement or editing operations, such as forward and inverse tone mapping or color grading, do not have a unique solution, but rather a range of solutions, each representing a different style. Despite this, existing learning-based methods attempt to learn a unique mapping, disregarding this diversity of styles. In this work, we show that information about the style can be distilled from collections of image pairs and encoded into a 2- or 3-dimensional vector. This gives us not only an efficient representation but also an interpretable latent space for editing the image style. We represent the global color mapping between a pair of images as a custom normalizing flow, conditioned on a polynomial basis of the pixel color. We show that such a network is more effective than PCA or VAE at encoding image style in low-dimensional space and lets us obtain an accuracy close to 40 dB, which is about 7-10 dB improvement over the state-of-the-art methods.
    Comment: Published in the European Conference on Visual Media Production (CVMP '22).
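
    As a concrete illustration, here is a sketch in the spirit of the PCA baseline the paper compares against (the paper's own method, a conditional normalizing flow, is not reproduced here): each image pair is reduced to the least-squares coefficients of a global polynomial color mapping, and PCA compresses these coefficients into a 2- or 3-dimensional style vector. The second-order basis and the latent size are illustrative assumptions.

    import numpy as np

    def poly_basis(rgb):
        """Second-order polynomial basis of pixel color; rgb: (N, 3) -> (N, 10)."""
        r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
        return np.stack([r, g, b, r*r, g*g, b*b, r*g, r*b, g*b,
                         np.ones_like(r)], axis=1)

    def fit_color_mapping(src, dst):
        """Least-squares W such that poly_basis(src) @ W ~= dst; returns (10, 3)."""
        A = poly_basis(src.reshape(-1, 3))
        W, *_ = np.linalg.lstsq(A, dst.reshape(-1, 3), rcond=None)
        return W

    def distill_style(pairs, dims=3):
        """Stack mapping coefficients over all image pairs; keep top PCA directions."""
        coeffs = np.stack([fit_color_mapping(s, d).ravel() for s, d in pairs])
        mean = coeffs.mean(axis=0)
        _, _, Vt = np.linalg.svd(coeffs - mean, full_matrices=False)
        basis = Vt[:dims]                       # (dims, 30) style basis
        latents = (coeffs - mean) @ basis.T     # per-pair low-dimensional style vectors
        return mean, basis, latents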

    Single-frame Regularization for Temporally Stable CNNs

    Get PDF
    Convolutional neural networks (CNNs) can model complicated non-linear relations between images. However, they are notoriously sensitive to small changes in the input. Most CNNs trained to describe image-to-image mappings generate temporally unstable results when applied to video sequences, leading to flickering artifacts and other inconsistencies over time. In order to use CNNs for video material, previous methods have relied on estimating dense frame-to-frame motion information (optical flow) in the training and/or inference phase, or on recurrent learning structures. We take a different approach to the problem, posing temporal stability as a regularization of the cost function. The regularization is formulated to account for different types of motion that can occur between frames, so that temporally stable CNNs can be trained without the need for video material or expensive motion estimation. The training can be performed as a fine-tuning operation, without architectural modifications of the CNN. Our evaluation shows that the training strategy leads to large improvements in temporal smoothness. Moreover, for small datasets the regularization can help in boosting the generalization performance to a much larger extent than what is possible with naïve augmentation strategies.
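
    A minimal sketch of the idea, assuming the simplest possible motion model (global integer pixel shifts) and PyTorch-style training code: the regularizer penalizes disagreement between the network output for a shifted frame and the shifted output for the original frame, so stability can be encouraged from still images alone. The names model, task loss and reg_weight are placeholders, not the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def stability_loss(model, x, max_shift=4):
        """Penalize disagreement between model(shift(x)) and shift(model(x)).

        x: batch of frames in NCHW layout. torch.roll wraps around at the
        borders, a simplification; cropping borders would avoid wrap artifacts.
        """
        dy = torch.randint(-max_shift, max_shift + 1, (1,)).item()
        dx = torch.randint(-max_shift, max_shift + 1, (1,)).item()
        shifted_then_mapped = model(torch.roll(x, shifts=(dy, dx), dims=(2, 3)))
        mapped_then_shifted = torch.roll(model(x), shifts=(dy, dx), dims=(2, 3))
        return F.mse_loss(shifted_then_mapped, mapped_then_shifted)

    def total_loss(model, x, target, reg_weight=0.1):
        """Fine-tuning objective: task loss plus the single-frame stability term."""
        return F.mse_loss(model(x), target) + reg_weight * stability_loss(model, x)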

    HDR-VDP-3: A multi-metric for predicting image differences, quality and contrast distortions in high dynamic range and regular content

    Full text link
    High-Dynamic-Range Visual-Difference-Predictor version 3, or HDR-VDP-3, is a visual metric that can fulfill several tasks, such as full-reference image/video quality assessment, prediction of visual differences between a pair of images, or prediction of contrast distortions. Here we present a high-level overview of the metric, position it with respect to related work, explain the main differences compared to version 2.2, and describe how the metric was adapted for the HDR Video Quality Measurement Grand Challenge 2023.

    A Model of Local Adaptation

    Get PDF
    The visual system constantly adapts to different luminance levels when viewing natural scenes. The state of visual adaptation is the key parameter in many visual models. While the time-course of such adaptation is well understood, little is known about the spatial pooling that drives the adaptation signal. In this work we propose a new empirical model of local adaptation that predicts how the adaptation signal is integrated in the retina. The model is based on psychophysical measurements on a high dynamic range (HDR) display. We employ a novel approach to model discovery, in which the experimental stimuli are optimized to find the most predictive model. The model can be used to predict not only the steady state of adaptation, but also conservative estimates of the visibility (detection) thresholds in complex images. We demonstrate the utility of the model in several applications, such as perceptual error bounds for physically based rendering, determining the backlight resolution for HDR displays, measuring the maximum visible dynamic range in natural scenes, simulation of afterimages, and gaze-dependent tone mapping.
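
    As an illustration of what such a model computes, the sketch below pools log-luminance with a Gaussian kernel to obtain a per-pixel adapting luminance. The Gaussian pooling shape and the one-degree pooling extent are illustrative assumptions, not the fitted model from the paper.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def adaptation_map(luminance, ppd=30.0, pooling_deg=1.0):
        """luminance: HxW array in cd/m^2; ppd: display pixels per visual degree."""
        sigma = pooling_deg * ppd                          # pooling extent in pixels
        log_lum = np.log10(np.maximum(luminance, 1e-4))    # avoid log of zero
        return 10.0 ** gaussian_filter(log_lum, sigma)     # adapting luminance map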

    High Dynamic Range Imaging Technology

    Get PDF
    In this lecture note, we describe high dynamic range (HDR) imaging systems. Such systems are able to represent a much larger range of luminance and, typically, a larger range of colors than conventional standard dynamic range (SDR) imaging systems. The larger luminance range greatly improves the overall quality of visual content, making it appear much more realistic and appealing to observers. HDR is one of the key technologies of the future imaging pipeline, which will change the way digital visual content is represented and manipulated.

    Depth from HDR: Depth Induction or Increased Realism?

    Get PDF
    Many people who first see a high dynamic range (HDR) display get the impression that it is a 3D display, even though it does not produce any binocular depth cues. Possible explanations of this effect include contrast-based depth induction and the increased realism due to the high brightness and contrast that makes an HDR display “like looking through a window”. In this paper we test both of these hypotheses by comparing the HDR depth illusion to real binocular depth cues using a carefully calibrated HDR stereoscope. We confirm that contrast-based depth induction exists, but it is a vanishingly weak depth cue compared to binocular depth cues. We also demonstrate that for some observers, the increased contrast of HDR displays indeed increases the realism. However, it is highly observer-dependent whether reduced, physically correct, or exaggerated contrast is perceived as most realistic, even in the presence of the real-world reference scene. Similarly, observers differ in whether reduced, physically correct, or exaggerated stereo 3D is perceived as more realistic. To accommodate most observers' sense of binocular depth and realism, display technologies must offer both HDR contrast and stereo personalization.